PRIMAL: Fast and Accurate Pedigree-based Imputation from Sequence Data in a Founder Population
نویسندگان
چکیده
Founder populations and large pedigrees offer many well-known advantages for genetic mapping studies, including cost-efficient study designs. Here, we describe PRIMAL (PedigRee IMputation ALgorithm), a fast and accurate pedigree-based phasing and imputation algorithm for founder populations. PRIMAL incorporates both existing and original ideas, such as a novel indexing strategy of Identity-By-Descent (IBD) segments based on clique graphs. We were able to impute the genomes of 1,317 South Dakota Hutterites, who had genome-wide genotypes for ~300,000 common single nucleotide variants (SNVs), from 98 whole genome sequences. Using a combination of pedigree-based and LD-based imputation, we were able to assign 87% of genotypes with >99% accuracy over the full range of allele frequencies. Using the IBD cliques we were also able to infer the parental origin of 83% of alleles, and genotypes of deceased recent ancestors for whom no genotype information was available. This imputed data set will enable us to better study the relative contribution of rare and common variants on human phenotypes, as well as parental origin effect of disease risk alleles in >1,000 individuals at minimal cost.
منابع مشابه
Value of Mendelian laws of segregation in families: data quality control, imputation, and beyond.
When analyzing family data, we dream of perfectly informative data, even whole-genome sequences (WGSs) for all family members. Reality intervenes, and we find that next-generation sequencing (NGS) data have errors and are often too expensive or impossible to collect on everyone. The Genetic Analysis Workshop 18 working groups on quality control and dropping WGSs through families using a genome-...
متن کاملGene Dropping Analysis of Founder Contributions and Allelic Diversity in an under Selected Population of Japanese Quails
The current study was conducted to consider how pedigree and the gene-dropping analysis can use to monitor genetic diversity, and to recommend a breeding strategy for improving breast weight (BRW) in a population of Japanese quail. A total of 312 birds were divided equally into two lines. One line (S1) was selected for four-week body weight (BW) based on breeding value, and the other (S2) was s...
متن کاملIdentity-by-Descent-Based Phasing and Imputation in Founder Populations Using Graphical Models
Accurate knowledge of haplotypes, the combination of alleles co-residing on a single copy of a chromosome, enables powerful gene mapping and sequence imputation methods. Since humans are diploid, haplotypes must be derived from genotypes by a phasing process. In this study, we present a new computational model for haplotype phasing based on pairwise sharing of haplotypes inferred to be Identica...
متن کاملRecursive Long Range Phasing and Long Haplotype Library Imputation: Building a Global Haplotype Library for Holstein cattle
Long range phasing (LRP) is a fast and accurate rule based method which uses information from both related and unrelated individuals by invoking the concepts of surrogate parents and Erdös numbers (Kong et al., 2008). Recursive long range phasing and long haplotype imputation (RLRPLHI; Hickey et al., 2009) is an extended LRP algorithm with increased robustness partially due to the extra long ha...
متن کاملHaplotype Inference in Complex Pedigrees
Despite the desirable information contained in complex pedigree data sets, analysis methods struggle to efficiently process these data. The attractiveness of pedigree data is their power for detecting rare variants, particularly in comparison with studies of unrelated individuals. In addition, rather than assuming individuals in a study are unrelated, knowledge of their relationships can avoid ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره 11 شماره
صفحات -
تاریخ انتشار 2015